11 research outputs found

LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech

Author: Alisamir Sina
Allauzen Alexandre
Besacier Laurent
Boito Marcely Zanon
Coavoux Maximin
Dinarelli Marco
Esteve Yannick
Evain Solene
Goulian Jérôme
Le Hang
Lecouteux Benjamin
Mdhaffar Salima
Nguyen Ha
Parcollet Titouan
Portet Francois
Pupier Adrien
Ringeval Fabien
Rossato Solange
Rouvier Mickael
Schwab Didier
Tomashenko Natalia
Zhang Shucong
Publication date: 11/09/2023

Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different domains, including computer vision and natural language processing. Speech processing has benefitted drastically from SSL, as most current domain-related tasks are now approached with pre-trained models. This work introduces LeBenchmark 2.0, an open-source framework for assessing and building SSL-equipped French speech technologies. It includes documented, large-scale and heterogeneous corpora with up to 14,000 hours of speech; ten pre-trained SSL wav2vec 2.0 models, containing from 26 million to one billion learnable parameters, shared with the community; and an evaluation protocol made of six downstream tasks to complement existing benchmarks. LeBenchmark 2.0 also presents unique perspectives on pre-trained SSL models for speech with the investigation of frozen versus fine-tuned downstream models and task-agnostic versus task-specific pre-trained models, as well as a discussion of the carbon footprint of large-scale model training.

Comment: Under submission at Computer Science and Language. Preprint allowed.

arXiv.org e-Print Archive

End-to-End Dependency Parsing of Spoken French

Author: Coavoux Maximin
Goulian Jérôme
Lecouteux Benjamin
Pupier Adrien
Publication venue: HAL CCSD
Publication date: 01/04/2022

Hal - Université Grenoble Alpes

Une chaîne de traitements pour la simplification automatique de la parole et sa traduction automatique vers des pictogrammes

Author: Macaire Cécile
Ormaechea Grijalba Lucia
Pupier Adrien
Publication venue: 'Associacio catalana de Salut Laboral'
Publication date: 01/01/2022

Augmentative and Alternative Communication (AAC) weighs heavily on people with disabilities and on those close to them because it is difficult to use. To lighten this burden, tools that translate speech into pictograms are relevant. They can also greatly help communicative accessibility in hospital settings. In this article, we present a research project that aims to develop a speech-to-pictogram translation system. It relies on a processing pipeline with several components drawn from natural language and speech processing, such as automatic speech recognition, syntactic parsing, text simplification, and machine translation into pictograms. We present the difficulties associated with each of these components and, for some of them, possible ways to resolve them.

Archive ouverte UNIGE

Une chaîne de traitements pour la simplification automatique de la parole et sa traduction automatique vers des pictogrammes

Author: Macaire Cécile
Ormaechea-Grijalba Lucia
Pupier Adrien
Publication venue: 'Associacio catalana de Salut Laboral'
Publication date: 27/06/2022

Augmentative and Alternative Communication (AAC) weighs heavily on people with disabilities and on those close to them because it is difficult to use. To lighten this burden, tools that translate speech into pictograms are relevant. They can also greatly help communicative accessibility in hospital settings. In this article, we present a research project that aims to develop a speech-to-pictogram translation system. It relies on a processing pipeline with several components drawn from natural language and speech processing, such as automatic speech recognition, syntactic parsing, text simplification, and machine translation into pictograms. We present the difficulties associated with each of these components and, for some of them, possible ways to resolve them.

Hal - Université Grenoble Alpes

Une chaîne de traitements pour la simplification automatique de la parole et sa traduction automatique vers des pictogrammes

Author: Macaire Cécile
Ormaechea-Grijalba Lucia
Pupier Adrien
Publication venue: 'Associacio catalana de Salut Laboral'
Publication date: 01/01/2022

Hal - Université Grenoble Alpes

Hal-Diderot

Archive ouverte UNIGE

End-to-End Dependency Parsing of Spoken French

Author: Coavoux Maximin
Goulian Jérôme
Lecouteux Benjamin
Pupier Adrien
Publication venue: 'International Speech Communication Association'
Publication date: 18/09/2022

Research efforts in syntactic parsing have focused on written texts. As a result, speech parsing is usually performed on transcriptions, either in unrealistic settings (gold transcriptions) or on predicted transcriptions. Parsing speech from transcriptions, though straightforward to implement using out-of-the-box tools for Automatic Speech Recognition (ASR) and dependency parsing, has two important limitations. First, relying on transcriptions leads to error propagation due to recognition mistakes. Second, many acoustic cues that are important for parsing (prosody, pauses, ...) are no longer available in transcriptions. To address these limitations, we introduce wav2tree, an end-to-end dependency parsing model whose only input is the raw signal. Our model builds on a pretrained wav2vec2 encoder with a CTC loss to perform ASR. We extract token segmentation from the CTC layer to construct vector representations for each predicted token. Then, we use these token representations as input to a generic parsing algorithm. The whole model is trained end-to-end with a multitask objective (ASR, parsing) to reduce error propagation. Our experiments on the Orféo treebank of spoken French show that direct parsing from speech is feasible: wav2tree outperforms a pipeline approach based on wav2vec (for ASR) and FlauBERT (for parsing).
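
The abstract above mentions extracting token segmentation from the CTC layer and building one vector per predicted token. The following is a minimal sketch of that general idea, not the wav2tree implementation: the blank symbol `_`, the word delimiter `|`, the function names `ctc_token_spans` and `pool_tokens`, and the use of mean pooling are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Sequence, Tuple

BLANK = "_"  # assumed CTC blank symbol (hypothetical choice)
SPACE = "|"  # assumed word-delimiter symbol (hypothetical choice)


def ctc_token_spans(frame_labels: Sequence[str]) -> List[Tuple[str, int, int]]:
    """Collapse per-frame CTC labels into words with frame spans.

    Returns (word, first_frame, end_frame_exclusive) for each word.
    """
    spans: List[Tuple[str, int, int]] = []
    chars: List[str] = []
    start = end = None
    prev = BLANK
    for i, label in enumerate(frame_labels):
        if label == SPACE:
            # A delimiter closes the current word, if any.
            if chars:
                spans.append(("".join(chars), start, end))
            chars, start, end = [], None, None
        elif label != BLANK:
            # Standard CTC collapse: a repeated label emits a new
            # character only if it differs from the previous frame.
            if label != prev:
                chars.append(label)
            if start is None:
                start = i
            end = i + 1
        prev = label
    if chars:
        spans.append(("".join(chars), start, end))
    return spans


def pool_tokens(
    frames: Sequence[Sequence[float]],
    spans: Sequence[Tuple[str, int, int]],
) -> List[List[float]]:
    """Mean-pool encoder frame vectors over each word span (an assumption)."""
    pooled = []
    for _, start, end in spans:
        window = frames[start:end]
        dim = len(window[0])
        pooled.append([sum(v[d] for v in window) / len(window) for d in range(dim)])
    return pooled


# Toy input: 12 frames of argmax labels and 2-dimensional frame vectors.
labels = ["_", "l", "l", "e", "_", "|", "c", "h", "_", "a", "t", "t"]
frames = [[float(i), 1.0] for i in range(len(labels))]
spans = ctc_token_spans(labels)
token_vectors = pool_tokens(frames, spans)
```

In this toy input, the two words recovered are "le" and "chat", each with one pooled vector that a downstream parser could consume in place of text-based embeddings.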

Hal - Université Grenoble Alpes

PROPICTO: Developing Speech-to-Pictograph Translation Systems to Enhance Communication Accessibility

Author: Bouillon Pierrette
Coavoux Maximin
Gerlach Johanna
Goulian Jérôme
Lecouteux Benjamin
Macaire Cécile
Mutal Jonathan
Norré Magali
Ormaechea Lucía
Pupier Adrien
Schwab Didier
Publication venue: HAL CCSD
Publication date: 01/06/2023

PROPICTO is a project funded by the French National Research Agency and the Swiss National Science Foundation that aims at creating Speech-to-Pictograph translation systems, with a special focus on French as an input language. By developing such technologies, we intend to enhance communication access for non-French-speaking patients and people with cognitive impairments.

Hal - Université Grenoble Alpes

PROPICTO: Developing Speech-to-Pictograph Translation Systems to Enhance Communication Accessibility

Author: Bouillon Pierrette
Coavoux Maximin
Esperança-Rodier Emmanuelle
Gerlach Johanna
Goulian Jérôme
Lecouteux Benjamin
Macaire Cécile
Mutal Jonathan
Norré Magali
Ormaechea Grijalba Lucia
Pupier Adrien
Schwab Didier
Publication venue: European Association for Machine Translation
Publication date: 01/01/2023

PROPICTO is a project funded by the French National Research Agency and the Swiss National Science Foundation that aims at creating Speech-to-Pictograph translation systems, with a special focus on French as an input language. By developing such technologies, we intend to enhance communication access for non-French-speaking patients and people with cognitive impairments.

DIAL UCLouvain

PROPICTO: Developing Speech-to-Pictograph Translation Systems to Enhance Communication Accessibility

Author: Bouillon Pierrette
Coavoux Maximin
Esperança-Rodier Emmanuelle
Gerlach Johanna
Goulian Jérôme
Lecouteux Benjamin
Macaire Cécile
Mutal Jonathan David
Norré Magali
Ormaechea Lucía
Pupier Adrien
Schwab Didier
Publication venue: HAL CCSD
Publication date: 12/06/2023

Hal - Université Grenoble Alpes

PROPICTO : Développer des systèmes de traduction de la parole vers des séquences de pictogrammes pour améliorer l’accessibilité de la communication

Author: Bouillon Pierrette
Coavoux Maximin
Esperança-Rodier Emmanuelle
Gerlach Johanna
Goulian Jérôme
Lecouteux Benjamin
Macaire Cécile
Mutal Jonathan
Norré Magali
Ormaechea Grijalba Lucia
Pupier Adrien
Schwab Didier
Spechbach Hervé
Publication date: 01/01/2023

PROPICTO is a project funded by the French National Research Agency and the Swiss National Science Foundation that aims to create speech-to-pictogram translation systems with French as the input language. By developing such technologies, we intend to improve access to communication for non-French-speaking patients and people with cognitive impairments.

DIAL UCLouvain